# lake

![Black Withered Tree Surounded By Body Of Water](lake.jpg)
Photo by [Kyle Roxas](https://www.pexels.com/photo/black-withered-tree-surounded-by-body-of-water-2138922/)

The idea is to implement a private data "lake" on top of the [tantivy search](https://github.com/tantivy-search/tantivy) library. It could serve as storage and an API for all kinds of data, with different frontends that either create their own schemas or support existing ones.

*This is in a very early development stage and by no means usable right now.*

## General outline

- [ ] receive "documents" as JSON payload
- [ ] decide on a payload structure
- [ ] decide on a folder structure
- [ ] support for tags / facets
- [ ] create one sub folder per schema / index
- [ ] create schemas from JSON Schema
- [ ] translate existing schemas to JSON Schema
- [ ] check for duplication before adding data
- [ ] store binary data separately
- [ ] authentication
- [ ] predefined schemas for websites, images, documents?
- [ ] try to extract text data from e.g. PDFs?
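
Two of the items above, one sub folder per schema and the duplicate check before adding data, can be sketched with the Rust standard library alone. Nothing here is decided yet: the `index_dir`, `fingerprint` and `SeenDocuments` names, the `data/<schema>/` layout and the choice of hashing the raw payload are all assumptions made for illustration, and tantivy itself is not involved.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Hypothetical layout: `<root>/<schema>/` holds the index for one schema.
fn index_dir(root: &Path, schema: &str) -> PathBuf {
    root.join(schema)
}

/// Fingerprint of a raw JSON payload.
/// Assumption: hashing the raw string is good enough for a sketch.
/// `DefaultHasher` is not guaranteed to be stable across Rust releases,
/// so a persistent store would need a fixed hash function instead.
fn fingerprint(payload: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    payload.hash(&mut hasher);
    hasher.finish()
}

/// In-memory stand-in for the duplicate check (hypothetical type).
struct SeenDocuments {
    fingerprints: HashSet<u64>,
}

impl SeenDocuments {
    fn new() -> Self {
        SeenDocuments {
            fingerprints: HashSet::new(),
        }
    }

    /// Records the payload and returns `true` if it was not seen before;
    /// returns `false` for a duplicate.
    fn insert_if_new(&mut self, payload: &str) -> bool {
        self.fingerprints.insert(fingerprint(payload))
    }
}

fn main() {
    let dir = index_dir(Path::new("data"), "websites");
    println!("index directory: {}", dir.display());

    let mut seen = SeenDocuments::new();
    let doc = r#"{"url": "https://example.com", "title": "Example"}"#;
    println!("first insert accepted: {}", seen.insert_if_new(doc));
    println!("second insert accepted: {}", seen.insert_if_new(doc));
}
```

Because the fingerprint is taken over the raw string, two payloads that differ only in whitespace or key order would count as different documents. Normalising the JSON before hashing would be one way around that, if it turns out to matter.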
